Most articles in the IASSIST Quarterly stem from presentations
given at sessions at the yearly IASSIST conference. This vol. 24-4 issue
of the IQ shows that other articles are very welcome as well. Mary J.
Lee from the University of Notre Dame gives us an introduction to an
environmental database, the "Toxics Release Inventory", and this
project has not been presented at IASSIST earlier. You can have a look
at the database at the web-site: http://www.epa.gov/tri. Secondly,
Alejandro Delgado-Gomez from Cartagena City Council Archives
had intended to participate in the Amsterdam 2001 IASSIST conference.
This plan was not realized; he was, however, able to send his paper to
the conference. The paper addresses the subject of storage and
dissemination of cultural heritage at a local level in the form of the
complex area of city planning. Lastly, Robert Strötgen from the
German Social Science Information Centre (IZ) in Bonn and Rolf
Uher at the Central Archive for Empirical Social Science Research (ZA)
at the University of Cologne made a presentation and contributed to
the success of the Amsterdam 2001 conference. Their concern is
computer assisted merging and archiving of distributed international
comparative data. The ISSP DataWizard is a software tool facilitating
these functions for the International Social Survey Programme. If you
did not make it to the Amsterdam 2001 conference or you want to have
a look at the presentations from the conference you should take a look
at the IASSIST web-site. Your editor is collecting both presentations
and papers from the conference. The papers will appear in this
newsletter while the presentations will be available for view at the
IASSIST web-site (http://www.iassistdata.org).
Karsten Boye Rasmussen - August 2001


Toxics  Release  Inventory  -  An 
Environmental  Database 


Data  Background/Description 

This paper describes a publicly available database called the Toxics
Release Inventory (TRI), which provides a valuable data source of
environmental information. The TRI data are machine-readable
microdata at the industrial facility level and are provided free with
unlimited access to the public. The data collection approach for this
database is innovative because it uses the information collection
provision as the regulatory instrument.

Following  a  chemical-release  accident  in  Bhopal,  India,  the 
U.S.  Congress  passed  the  Emergency  Planning  and 
Community  Right-to-Know  Act  (EPCRA)  in  1986.  Under 
these provisions, manufacturing facilities with 10 or more
employees in Standard Industrial Classification (SIC) codes
20 through 39 are required to publicly disclose their annual
toxic releases to air, water, and land as well as off-site
transfers. The U.S. Environmental Protection Agency
(EPA) compiles these annual reports into the TRI database.
Because  of  the  mandatory  requirement  of  data  provision 
and  its  inclusion  of  a  public's  right-to-know  provision,  this 
database  provides  a  reliable  source  of  environmental 
performance  information. 

Since  the  initial  data  release  in  1989,  the  number  of 
reporting  facilities  and  chemicals  has  been  increased. 
Seven  industrial  sectors  have  been  added  to  the  original 
reporting  manufacturing  industries.  These  include  electric 
utilities,  coal  mining,  metal  mining,  chemical  wholesalers, 
petroleum bulk plants and terminals, solvent recovery, and
hazardous waste treatment, storage, and disposal. New
data for 1998 were released in 2000 covering seven
industrial  sectors.  There  is  a  two-year  time  lag  in  the 
release  of  TRI  data.   For  example,  the  most  recent  data 
release  in  2000  is  for  the  reporting  year  1998. 

The TRI data have become a primary source of
environmental performance information for a broad range
of user groups. These include social scientists,
environmentalists, government officials, investors,
consulting firms, journalists, and health professionals.
International organizations have also recently joined the
group of TRI users.


by Mary J. Lee*


Data Dissemination/Search Engines

Data Access

Technological progress has brought major changes in data
management and data analysis systems. Data storage and data
access are much easier due to the faster speed of personal or
mainframe computers, larger data storage capacities
and the development of the Internet.
Accessibility and media format options for the TRI have
also changed. The TRI data are currently available on
floppy diskette, CD-ROM, or through the Internet.

The floppy diskettes contain the most frequently used data
elements including each facility's identification numbers,
county, city, state, zip code, SIC code, parent company
name, chemical name and chemical registry number, and total
releases to the air, water, land, underground injection and
off-site transfers. They also include the longitude and
latitude of the facility and its Federal Information Processing
Standards (FIPS) code. The CD-ROM edition consists
of two CDs. Disc one has the TRI data for 1987-1990.
Disc two contains data for 1991-1996. The basic features
of the CD-ROM include: user guide, combining searches
using Boolean operators, displaying records, exporting
records in several formats, creating custom reports and
calculating the data using KASTAT. The TRI data on the
Internet are available for the time span of 1988-1998.
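
To make the facility-level record and the Boolean search feature described above more concrete, here is a minimal Python sketch. The class name TriRecord, the field names and the sample rows are assumptions invented for illustration; they do not reproduce the EPA's actual file layout or the CD-ROM's search software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriRecord:
    """One facility/chemical/year row (hypothetical field names)."""
    facility_id: str
    state: str
    sic_code: int
    chemical: str
    year: int
    air_lbs: float
    water_lbs: float
    land_lbs: float

    @property
    def total_release_lbs(self) -> float:
        # Sum of the on-site release media held in this sketch.
        return self.air_lbs + self.water_lbs + self.land_lbs


# Invented sample rows, for illustration only.
RECORDS = [
    TriRecord("F001", "IN", 28, "TOLUENE", 1996, 1200.0, 30.0, 0.0),
    TriRecord("F002", "IN", 33, "LEAD", 1996, 15.0, 2.0, 400.0),
    TriRecord("F003", "OH", 28, "TOLUENE", 1995, 900.0, 0.0, 0.0),
]


def in_indiana(record: TriRecord) -> bool:
    return record.state == "IN"


def is_toluene(record: TriRecord) -> bool:
    return record.chemical == "TOLUENE"


def search(records, predicate):
    """Return the records for which the predicate holds."""
    return [r for r in records if predicate(r)]


# Boolean AND, OR and NOT combinations of two simple criteria.
and_hits = search(RECORDS, lambda r: in_indiana(r) and is_toluene(r))
or_hits = search(RECORDS, lambda r: in_indiana(r) or is_toluene(r))
not_hits = search(RECORDS, lambda r: in_indiana(r) and not is_toluene(r))
```

With these sample rows, the AND search keeps only facility F001, the OR search keeps all three rows, and the NOT search keeps only F002.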

TRI  Explorer 

The TRI Explorer is a search engine that provides access to
the TRI data on the Internet. The initial version of the TRI
Explorer included on- and off-site release data. The latest
version added waste transfer and waste management data to
the original toxic release data. This search engine allows
you to identify facilities and their chemical releases. Data
can be output as release, waste transfer and waste
quality reports. Data can be grouped according to five
criteria: facility, chemical, year, industry type, and
geographic area at the county, state or national level.
Waste management reports include recycling, energy
recovery, treatment as well as off-site waste transfers. A
trends  report  option  is  also  available  for  the  core  chemicals. 
Metadata  on  the  web  provides  valuable  information 
including  detailed  data  element  descriptions.   Data 
elements  provide  both  facility  identification  information 
and  chemical-specific  information. 
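
The grouping of release data by a chosen criterion, as described above, can be sketched as a simple aggregation. This is an illustrative Python sketch only: the column names, the sample rows and the total_by function are assumptions and are not part of the TRI Explorer itself.

```python
from collections import defaultdict

# Hypothetical column layout and invented sample rows.
FIELDS = ("facility", "chemical", "year", "state", "pounds")
ROWS = [
    ("F001", "TOLUENE", 1996, "IN", 1230.0),
    ("F002", "LEAD", 1996, "IN", 417.0),
    ("F003", "TOLUENE", 1995, "OH", 900.0),
    ("F001", "TOLUENE", 1995, "IN", 1500.0),
]


def total_by(rows, criterion):
    """Sum released pounds for each distinct value of one grouping criterion."""
    index = FIELDS.index(criterion)
    totals = defaultdict(float)
    for row in rows:
        totals[row[index]] += row[-1]
    return dict(totals)


by_chemical = total_by(ROWS, "chemical")
by_year = total_by(ROWS, "year")
by_state = total_by(ROWS, "state")
```

The same function serves every criterion, so a trends report in this sketch is simply the by_year totals read in chronological order.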

Other TRI Sites on the Internet

On-line searching for the TRI data is available through
web sites such as Envirofacts: Data Warehouse and
Applications and the National Library of Medicine (NLM)
TOXNET System. The Envirofacts Warehouse includes
multiple environmental databases and allows you to retrieve
environmental information from several EPA databases.
Spatial data are available using the Maps on Demand
applications. TOXNET (Toxicology Data Network) is a
cluster  of  databases  on  toxicology,  hazardous  chemicals, 
and  other  related  environmental  or  public  health  areas. 

International  Development  Systems 

International TRI-like Systems

There has been a growing global movement toward information
databases on toxic releases. The international TRI-like systems
are called Pollutant Release and Transfer Registers
(PRTRs). International organizations have started
initiatives to promote the development of PRTRs. The
PRTRs in the world include: Canada: National Pollutant
Release Inventory (NPRI); United Kingdom: Pollutant
Inventory (PI); Mexico: Registro de Emisiones y
Transferencia de Contaminantes (RETC); Australia:
National Pollutant Inventory (NPI); and Czech Republic:
Pollutant Release and Transfer Register (PRTR). Among the
three North American PRTRs, Canada has PRTR data
starting in 1993. Mexico collects PRTR data from industrial
facilities on a voluntary basis. The Commission for
Environmental Cooperation (CEC), an environmental
organization created by the North American Free Trade
Agreement (NAFTA), compiles the data and publishes an
annual report on the North American PRTRs.

Asian countries have also participated in this international
movement. Japan hosted the most recent international
conference on PRTRs: National and Global Responsibility
in September 1998. Indonesia developed a similar public
disclosure program called the Program for Pollution Control,
Evaluation and Rating (PROPER). Indonesia's National
Pollution Control Agency initiated this pollution control
program to evaluate the environmental performance of
Indonesian factories. The Philippines followed in Indonesia's
footsteps. The Philippines' Department of Environment
and Natural Resources recently started a public disclosure
program called EcoWatch, modeled on Indonesia's
PROPER program.

International  Organization  PRTR  Sites 

The international movement on PRTRs stems from the
1992 Earth Summit, also called the United Nations
Conference on Environment and Development (UNCED).
Several international organizations now have their own
home pages for the development of PRTRs: the Organization
for Economic Co-operation and Development (OECD)
PRTR homepage, the United Nations Environment
Programme (UNEP) PRTR homepage, the UNITAR PRTR
homepage and the World Bank PRTR homepage.

Further  Data  Usability 

Growing  international  interest  in  toxic  release  information 
presents  the  opportunity  to  combine  the  various  databases 
and  compare  each  country's  toxic  releases  and  waste 
management  activities.  It  also  provides  the  possibility  of 
developing  an  international  toxic  release  database  in  the 
future.  This  would  be  in  addition  to  the  existing  data 
archives  and  would  be  beneficial  both  to  academic 
researchers  and  government  policy  makers.  Wide  use  of 
this  database  will  also  provide  industries  with  the  incentive 
to  improve  existing  pollution  abatement  technology. 

In addition to the unified international database, it is
suggested that a comprehensive database may be developed
using existing databases from other areas. One such area is
public health, where human health risks could be measured
using the TRI and other health information databases.
Other health information databases include the Hazard
Information on Toxic Chemicals, the Integrated Risk
Information System (IRIS), and ToxFAQs by the
Agency for Toxic Substances and Disease Registry
(ATSDR). Other areas are finance and economics, where
the financial effects of environmental information can be
assessed using financial databases. These include the
Center for Research in Security Prices (CRSP) database
and the Standard & Poor's Compustat database.
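
As a rough illustration of how TRI release totals might be linked to a financial database, the Python sketch below joins two small tables on a normalised parent company name. Everything in it is an assumption made for illustration: the company names and figures are invented, the normalize and link functions are hypothetical, and matching on names is only a crude stand-in for the careful identifier matching that a real study would need.

```python
# Suffixes dropped when building the crude matching key (an assumption).
SUFFIXES = {"INC", "CORP", "CO", "LTD"}


def normalize(name: str) -> str:
    """Upper-case a company name, strip punctuation and common suffixes."""
    cleaned = "".join(ch for ch in name.upper() if ch.isalnum() or ch.isspace())
    words = [w for w in cleaned.split() if w not in SUFFIXES]
    return " ".join(words)


# Invented sample data, for illustration only.
tri_totals = {"Acme Chemical Corp.": 5200.0, "Borealis Metals, Inc.": 310.0}
financials = {
    "ACME CHEMICAL": {"sales_musd": 120.0},
    "Cobalt Paper Co": {"sales_musd": 80.0},
}


def link(tri, fin):
    """Keep only companies present in both sources, merging their values."""
    fin_by_key = {normalize(name): values for name, values in fin.items()}
    linked = {}
    for company, pounds in tri.items():
        key = normalize(company)
        if key in fin_by_key:
            linked[key] = {"release_lbs": pounds, **fin_by_key[key]}
    return linked


linked = link(tri_totals, financials)
```

Only the one company that appears in both invented tables survives the join; unmatched names are dropped rather than guessed at.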

Conclusion 

The Toxics Release Inventory database is a valuable
resource for researchers in the areas of
environmental studies, health and business. It also
provides policy makers and the public with a reliable
source of information on toxic emissions and serves as a
regulatory tool for the management of industrial pollutants.
In addition, the availability of TRI data along with similar
efforts in other countries provides an incentive for
cooperative efforts in international reporting and analysis
of toxic release data.

* Mary J. Lee, 945 Flanner Hall, Laboratory for Social
Research, University of Notre Dame, Notre Dame, IN
46556 U.S.A. Tel (219) 631-4521 E-mail: Lee.82@nd.edu


Harmonising Methods of Disseminating
Urban  Heritage 


Introduction 

In cities with a long history, we can often find a high level of
dispersion of the cultural heritage, as well as of the methods of
preserving, describing and disseminating it. This paper explores a
process for harmonising the documentation and retrieval of cultural
heritage at a local level. Although this process is valid in any
context, we will focus our example on the most complex area that we
have found: city planning.

Cartagena is a medium size city, with a long history starting
from the Carthaginian period. The city owns several
collections connected to planning and landscape, although,
as a sample, we will concentrate this presentation on the
development of the city carried out between 1875 and 1934,
known, at a local level, as the "Ensanche." During this
period there was an effort to modernise and enrich the city
with a strong Modernist orientation, coincident with a
period of industrialisation promoted by foreign investors.
In addition, nowadays the city is remodelling the old
Ensanche, in such a way that the Council and private
organisations are generating a significant volume of active
records. The first remodelling, on the other hand, generated
files, architectural drawings, archival documents,
administrative regulations, as well as ancillary products,
like bibliographic essays, photographs, paintings, etc. Of
course, the different quarters and buildings are the main
product of that effort. Therefore, we can distinguish,
conventionally, two kinds of collections, "static" and
"dynamic":

1. Dynamic - Records, still active, related to urban
reforms in progress, as well as to the retrieval of
Carthaginian, Roman and Modernist architectural
heritage. We are interested, at the moment, in this last
period.

2. Static - Different cultural collections, including
archival files, plans, drawings, photographs, etc.;
museological and bibliographic collections, ancient
serials, some samples of paintings and other fine arts,
etc.

Additionally, not all of this documentation is owned by the
local government; some belongs to individuals, foundations,
private companies, etc.

by Alejandro Delgado-Gomez

Of course, these collections have not been preserved and
described in a consistent way over the years. As an
obvious example, the active documentation is managed through the
organisation of the information programs, and the closed
documentation through more static databases.

Since Cartagena is a growing city, oriented towards tourism
and heritage retrieval, and, at the same time, with a
population strongly involved in its cultural environment,
one of the priorities of the Council is the documentation,
preservation, restoration and, mainly, dissemination of the
history in a consistent way.

The solution we show uses, as a pretext, a physical and
digital exhibition of the urban history of the city,
thereby harmonising the preservation, description and
dissemination of the above-mentioned materials. We must
note, however, that the cultural heritage of the
city, as well as the urban planning, has been managed
rather poorly for years. This imposes constraints on the
harmonisation process. The first constraint is the chaotic
situation that we found among the records. The second
constraint is determining a way to re-arrange the information
in some conventional way. These two tasks have to
succeed before moving to more sophisticated techniques
and procedures.

Contents  and  organisation  of  the  information 

We have to work with two quite different kinds of
documents, data repositories and, as a consequence,
information:

1. The Urban Planning Department is developing a
"new city" and generating a great deal of new
information. It is using a conventional computer
supported co-operative workflow tool, based on
Microsoft products: Visual Basic, Access and so on.
However, because the Department is divided into
several offices, some of them are still using obsolete
programs and tools, for instance, WordPerfect 5.1 or 486
processors. Because most of the staff do not have
strong computer skills, we cannot suddenly remove an
old, familiar system to implement a more adequate
system. On the other hand, we are developing our work
in co-operation with the Data Processing Centre, but we
are not responsible for the Urban Planning Department.
Therefore, we have to work, simultaneously, in four
different steps:

a) The oldest databases. We have to get a correct
migration from these to our archival system, and this
implies the use of an intermediary program, friendly
for the civil servants, but terribly annoying and
certainly useless for us.

b) The CSCW. Since this is a more up-to-date
system, we are using it, at the moment, as an
intermediary step, with two aims: on the one hand, to
train the staff in new uses of the technology; on the
other, to migrate newly created records, so that
we can minimise the "disturbing" effect of the oldest
databases.


c) Development of a modeling system. We need to
develop a system that is capable of structuring the
information that the Urban Planning Department is
generating. The users do not want to know the
modeling system, so there needs to be an interface to the
system. In this way, we will migrate data according
to our interests, avoiding a negative reaction on the part
of the Department. To reach this end, we are using the
well-known IDEF techniques, specifically IDEF0 and
IDEF5, as developed by KBSI.



IASSIST Quarterly Winter 2000



d)  Interface.   We  will  develop  a  mask  with  the 
appearance  of  a  "Windows-based  program"  but 
actually  independent  of  any  platform. 

This seemingly strange mixture of procedures, techniques
and tools allows for a structured migration of
information with a minimal impact on the staff who input
the information. It allows us to reconcile and harmonise
this information with other databases.

2. The Councillor of Culture Office, on the other
hand, is promoting the retrieval, arrangement and
dissemination of cultural heritage, in all its different
facets, that is to say: museums, libraries, archives,
buildings, fine arts, etc. Since all of these are
better-consolidated areas, the situation is not as problematic as
with the Urban Planning Department. These institutions
are using relational databases to describe their items,
although not the same model of database: Access, Fox
Pro, Dbase, etc. It will be easier to implement changes
here. These staff are skilled professionals who
understand the concept of harmonisation and have
strong computer skills.


However, in spite of the co-operative staff, we have an
additional problem. Because of the relationships between
Urban Planning and historical and cultural information,
there is a permanent flow and re-flow of documents and
information. We are modeling cultural institutions in a
similar way, in order to reconcile procedures and
techniques. Our aim is not to reach one homogeneous
database, but to allow every professional to manage his or
her own databases. However, all will follow similar
procedures, which will make the retrieval of information
easier for the professional, the intermediary and the
end-user. All of this means that we have to deal at least with
the following "instances", and their relationships:

a) Active records, managed according to three different
means: old databases, based on MS-DOS platforms
and DBF files; recent databases, based on Windows
platforms and MDB files; and a mask to replace the
former databases.

b) Non-active records and files, consisting of:
-Archival materials, dating from 1245 onwards and
containing a highly dispersed range of information.
-Bibliographic materials, dealing mainly with the
history of the urban development of the city.
-Ancient serials, newspapers and other newspaper
materials, also with a high level of dispersion, as
they reported the first Modernist development of
the city day-by-day.
-Cartographic materials and other plans, drawings
and projects.
-Ancillary materials, such as photographs,
paintings and other fine and decorative arts, etc.
-Buildings and other architectural items, ranging
from squares, markets, facades and fountains to parts
of buildings such as doors or windows, which are
protected by the regulations about cultural
heritage.

Since most of these materials are hosted by dissimilar kinds
of institutions, their records are collected and maintained
differently. Some of them are registered and described by
means of word processors, others by means of different
relational databases. The highest level of homogeneity is
being reached by public institutions, because of an
agreement between the Archives Department, the Data
Processing Centre and a private company. In this way, we
are solving at least one of the problems. The Archives,
the Libraries and the Public Museums are all using only one
programming language - Visual Fox Pro - and only one
associated kind of relational database. In addition, they are
describing their materials according to a single structure -
the ISO 2709 standard, making in this way the interchange
of information easier. Each type of repository uses the most
adequate description standards: USMARC formats,
ISAD(G)2, CIDOC standards. Which standard is used is not
important here, since our interest is to obtain a homogeneous
structure to interchange information, and an indexing and
classification system capable of retrieving that information from
dispersed points. We are aware of the technological poverty
of this solution. On the one hand, we think that, at least, it
is realistic, and allows us to put a bit of order into the chaos.
On the other, this solution allows us to use a distributed
database, instead of dispersed and heterogeneous data
repositories. Visual Fox Pro, merged with some other
proprietary applications, allows us to manage the
documents and the information in a robust and reasonably
flexible way. Of course, all of us hope this will be, also, a
temporary solution.
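
To show what describing materials according to a single ISO 2709 structure means in practice, the following Python sketch serialises and re-reads one record made of a 24-character leader, a directory and variable fields. It is an illustration under stated assumptions, not the Visual Fox Pro implementation used in Cartagena: it assumes the MARC-style directory layout (3-character tag, 4-digit field length, 5-digit starting position), and the function names, tags and field contents are invented.

```python
FIELD_END = b"\x1e"   # ISO 2709 field separator
RECORD_END = b"\x1d"  # ISO 2709 record separator


def build_record(fields):
    """Serialise (tag, bytes) pairs into one ISO 2709 style record."""
    directory = b""
    data = b""
    for tag, value in fields:
        body = value + FIELD_END
        # Directory entry: tag, field length, offset from the base address.
        entry = f"{tag:>3}{len(body):04d}{len(data):05d}"
        directory += entry.encode("ascii")
        data += body
    base = 24 + len(directory) + 1   # leader + directory + its terminator
    length = base + len(data) + 1    # plus the record terminator
    # Leader: record length (0-4), base address (12-16), entry map (20-23).
    leader = f"{length:05d}nam  22{base:05d}   4500".encode("ascii")
    return leader + directory + FIELD_END + data + RECORD_END


def parse_record(raw):
    """Read the fields back by walking the 12-character directory entries."""
    base = int(raw[12:17])
    directory = raw[24:base - 1]
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[:3].decode("ascii")
        length = int(entry[3:7])
        start = int(entry[7:12])
        # Drop the trailing field separator from the stored field.
        body = raw[base + start:base + start + length - 1]
        fields.append((tag, body))
    return fields


# Invented sample: a control number and a title field with one subfield.
SAMPLE = [("001", b"CT-0001"), ("245", b"10\x1faEnsanche de Cartagena")]
RAW = build_record(SAMPLE)
ROUND_TRIP = parse_record(RAW)
```

Because every repository writes the same leader and directory, a reader needs no knowledge of the description standard used inside the fields in order to split a record into tagged fields.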

At an internal level, we are modelling this information, as we
said, following the IDEF techniques and, at the
same time, migrating data to conventional HTML files in
order to get a more homogeneous display. With regard to
some other associated problems, the Archives Department
is signing agreements with private institutions hosting
materials, in order to manage them.

Finally, since, as we said, this situation is provisional
and cannot be sustained for a long time, we have been
planning a technological process, currently in progress, to
ensure a persistent harmonisation of materials and
associated data and data repositories, through the use of
modelling techniques and metadata languages.

Technological  steps 

The  following  is  a  list  of  all  the  technological  steps  in  this 
process.  Some  of  them  are  finished  while  others  are  still  in 
progress. 

1. Digitisation of materials not digitised yet, or of
associated documentation in the case of archaeological
and architectural items. Most of the significant
materials for public institutions - buildings,
architectural items and drawings, archival documents
regarding the first urban reforms - have been digitised.
This is not the case for private institutions. Therefore,
one priority is to digitise these materials. However,
we have finished a complete union catalogue of the
architectural heritage.

2. Analysis of materials, in order to develop a model
of classification and indexing, allowing a refined
retrieval in a subsequent step. Since one of our main
interests is a sophisticated retrieval of the information,
oriented to users' needs, not strictly to contents, a
detailed definition of the ontology, as well as its
entities, attributes and elements, is a sine qua non
requirement. With regard to thematic indexing, we
are using the OECD Macrothesaurus, as it is simple
and allows quite accurate retrieval, considering
that contents are not our primary interest. A
classification according to the users' needs is more
complicated, as it implies a market analysis and the
use of statistical and psychological devices. At the
moment, we are using a conventional solution,
perhaps too easy, but useful: we are classifying the
items according to some of the UDC auxiliary tables,
basically, that for people. In this way we can
retrieve information according to the users' age, skills,
education or planned use of the information.
Obviously, a combined search by indexing terms and
users' classification is also possible. A more refined
classification will have to wait.

With regards to the analysis of the end-users, we have
added  a  module  to  create  and  control  statistical  data.  At  the 
moment  it  is  quite  simple,  but  we  are  finishing  a  new  and 
more  sophisticated  module,  and  a  model  of  survey  to  be 
incorporated  next  academic  year. 

3. Development of a generic relational database. This
will allow the professionals to enter any kind of basic
data. At a second level, it will allow them to add any kind
of relevant information. As we said above, we are able
to customise these databases, by creating profiles
according to the different kinds of professionals' needs.

4. Conversion to a generic metadata language. This was
quite a problematic issue, basically because of the
current "inflation" of specialised metadata languages.
After carefully reviewing the state of the art, we had to
decide between two options:

-To use a simple, basic language to interchange
information. The most obvious example is Dublin
Core, but we realised that the elements in Dublin Core were
clearly insufficient to accommodate the exhaustive
description necessary in some cases.

-To use specific metadata languages for each type of
data repository: for instance, CHIO for museums,
EAD for archives, MARC-DTD for libraries, TEI for
publishing departments, and so on. This option was
simply unmanageable and, in addition, we ran the risk
of returning to the initial chaotic situation. We
considered the use of only one specific
language in different contexts (maybe ILSES or EAD),
but even so, both of them are difficult to reconcile
with the description of active records, which requires a
qualitatively different method such as, for instance, an
adaptation of GILS or DDI.

Finally, we rejected these options and chose an easier
one. Since cultural heritage databases are using a
standardised language, MARC, which is easy to convert, and
even the active records are going to finish their
life-cycle in the archives, we are working simply with
XML and associated technologies, and displaying the
information by means of HTML and associated technologies.



5. Design of a search engine capable of filtering the
information not according to its contents but
according to the users' interests. At the moment we
have to work with conventional tools: the IFLA
Guidelines for displaying information through OPACs, and
the ANSI/NISO Z39.71 standard for displaying holdings.
We can also customise this search engine, as well as the
printed reports, according to different users' needs.

6. Develop satisfactory "visual" outputs. We are
dealing with a thematic digital library with a large
visual component. Thus, we must also work on
"visual" outputs, both to satisfy the end-users and the
professionals involved, using VRML technologies
ourselves and allowing them to use authoring tools.
At the moment, we have to keep things simple. Each item
has a description and, associated with it, one or more
"contextual files", depending on their relevance: static
images, sound, video or text.

7. If the steps mentioned are developed correctly -
and at the moment that is the situation - we will be able
to produce any kind of digital or physical output
(website, intranet, kiosk, OPAC, DVD, webTV, printed
materials...) using consistent and conventional
technologies. In any case, outputs are not a problem for
us if we can harmonise the different databases in a
distributed virtual database. In fact, even though we
will still have to fight against the chaos for a long
while, we are developing the website, the intranet and
the OPACs based on the current achievements. These
outputs, currently in progress, will replace the
tools now at the users' disposal: a poor website, a
rather unfriendly OPAC, and a TV channel lacking
information. We hope to be able to put some of the
outputs into operation by summer, and the
project will be finished in no more than one year.

* Paper prepared for the IASSIST/IFDO Conference,
Amsterdam 2001. Alejandro Delgado-Gomez, Cartagena
City Council - Archives, Publications, Libraries and
Information Science Department. Phone: 0034 968 128855,
Fax: 0034 968 128856. archivo@ayto-cartagena.es


IASSIST Quarterly Winter 2000


ISSP DataWizard - Computer Assisted Merging and Archiving of Distributed International Comparative Data

by Robert Strötgen & Rolf Uher

Since 1985 the International Social Survey Programme (ISSP) has conducted
annual social surveys in the participating countries, covering relevant
topics from the social sciences. The ISSP was founded in the early eighties
by four social science research institutes for the purpose of adding an
international comparative aspect to the existing national social surveys.
The founding members were:


Country        Institute                                   Survey
USA            NORC: National Opinion Research Center,     GSS: General Social Survey
               University of Chicago
Germany        ZUMA: Zentrum für Umfragen, Methoden        ALLBUS: Allgemeine Bevölkerungsumfrage
               und Analysen                                der Sozialwissenschaften
Great Britain  The National Center for Social Science      BSA: British Social Attitudes
               (formerly Social and Community Planning
               Research, SCPR), London
Australia      RSSS: Research School of Social Sciences,   NSSS: National Social Science Survey
               Australian National University, Canberra

The ISSP agreed to four general principles:

1. jointly develop topical modules dealing with
important areas of social science,

2. field the modules as a fifteen-minute
supplement to the regular national surveys (or a
special survey if necessary),

3. include an extensive common core of
background variables, and

4. make the data available to the
social science community as soon as
possible.

As of 2001 the number of participants has
grown from the original four countries to
38 countries worldwide. The map shows
the geographical distribution of the current ISSP members.

[Figure: map of the current ISSP member countries]

The member states are:

Australia         Japan
Austria           Latvia
Bangladesh        Mexico
Brazil            Netherlands
Bulgaria          New Zealand
Canada            Norway
Chile             Philippines
Cyprus            Poland
Czech Republic    Portugal
Denmark           Russia
Finland           Slovakian Republic
Flanders          Slovenia
France            Spain
Germany           South Africa
Great Britain     Sweden
Hungary           Switzerland
Ireland           Taiwan
Israel            USA
Italy             Venezuela

The first ISSP survey in 1985 was conducted in 6
countries, the original 4 plus Italy and Austria, and had the
topic "Role of Government". The following topics have
been fielded since then or are being planned for the near
future:


Topics 1985 - 2004

Topic                 Years
Role of Government    1985, 1990, 1996
Social Networks       1986, 2001
Social Inequality     1987, 1992, 1999
Family                1988, 1994, 2002
Work Orientations     1989, 1997
Religion              1991, 1998
Environment           1993, 2002
National Identity     1995, 2003
Citizenship           2004

Replicating topics over time has enriched the international
comparative aspect by adding a time-series component.

At the 1986 ISSP meeting in Mannheim the Zentralarchiv
was chosen as the "Archive of the ISSP". The tasks of the
Zentralarchiv are to archive, check and maintain the data and
documentation of the country-specific studies and to distribute
them to the scientific community. Following the structure of
the questionnaire and the set of standard background
variables, an international comparative data set is prepared,
accompanied by extensive and detailed comparative
documentation. This documentation, also known as a
codebook, includes all information that is necessary to
analyze and interpret the data set. In addition to the
methodological and technical description of the survey, the
codebook includes the complete questions and answer
categories, the frequency distributions broken down by
country, and also country-specific details such as deviations
from the agreed standard, problems with translations of
indicators and the like.

The first step in creating the international file is the
production of a "Standard Setup". This includes the desired
structure of the integrated file, the variable names,
variable labels, codes, value labels and the definition of the
missing values. The starting point for the production of the
"Standard Setup" is the basic questionnaire of the respective
ISSP module and the set of standard background variables
defined for the ISSP.

Even though this "Standard Setup" is distributed to
the participating countries in advance, a number of cases
show considerable deviations from the desired standard,
which must be dealt with in the process of merging the
country data into the integrated file prior to archiving the
data.


[Figure: Merging Process. The Standard Setup (SPSS) is merged with the
setup and data of each country (Country 1, Country 2, Country 3).]

Starting with 6 country data sets in 1985, by 1998 the ISSP
had grown to include 28 different data sets, dramatically
increasing the workload for preparing the merged file.


[Figure: Number of data sets to be merged per year, 1985-1998]

The amount of work that is necessary to harmonize one
country data set to the standard can be measured by the
number of different documents that have to be viewed and
used in order to understand all the details of the
country-specific conditions.

Documents needed for processing:

* ISSP basic questionnaire
* Standard setup
* Standard background variables
* Codebooks of earlier ISSP modules
* Frequencies of original data
* Original questionnaire
* Dictionaries
* English documentation
* Re-coding documentation
* Frequencies of re-coded data
* Other sources: internet, ISCO,
statistical yearbooks, etc.


Adherence to the predefined standard varies in quality.
The number of recode statements needed to
harmonize a country data set to the final standard varies
from 50 to 300; in one case over 1000 recode statements
were needed to merge the file. In all of these cases each
recode statement must be checked and proofed to
determine whether the desired results have been attained.

The time- and resource-consuming nature of this process
led to ideas for developing a tool that would efficiently
support the harmonizing process. The instrument needed to
compare the pre-defined standard with the actual
country-specific setup according to the following processing
criteria. The first comparison should be done automatically,
resulting in a list of all deviations. The deviations in the setup
and in the data set itself are then corrected through individual
recoding. The interface for this process must be
user-friendly. The system should create a detailed report of all
steps in the process.

ISSP DataWizard:

* Maps original with standard setup
* Provides a comparative view
* Assigns variables and values to standard
* Re-codes data to standard
* Allows for individual data-processing
* Reports processing steps

The concept of the ISSP DataWizard is being developed
cooperatively by the GESIS (German Social Science
Infrastructure Services) institutes, the Central Archive for
Empirical Social Research (ZA) in Cologne and the
German Social Science Information Center (IZ) in Bonn.

The first "real" application after the explicit test phase will
be with the 2000 ISSP module on "Environment II".
In a further stage of development the ISSP DataWizard will
be prepared for distributed use so that the ISSP participants
can prepare their data sets at their home institutions. The
expectation is thus high that the data delivered to the
archive will conform much more closely to the
pre-defined standard, thereby facilitating the final steps of
merging the files into a common international data set.

Implementation of the ISSP DataWizard

The ISSP DataWizard is implemented at the German Social
Science Information Center (IZ) as a Java Swing
application. The use of a platform-independent
application is an advantage, particularly when distributing
the tool to ISSP project partners in different countries.
Although the data are stored in an Oracle database, the tool
can be used just as readily with any relational, JDBC-capable
database server.


Great importance is attached to providing an ergonomic
user interface in order to give users a sound level
of support, focus their attention on the main aspects of their
work and not burden them with unimportant areas or
information. The WOB model («tool metaphor based
strictly object orientated graphic direct manipulative user
interface») is a set of software ergonomic proposals which,
in their entirety, were designed to create efficient and
«natural» user interfaces (Krause 1995). The development
of the ISSP DataWizard was oriented on this model.

Based on the analyses of current work processes carried out
cooperatively by ZA and IZ, an attempt has been made to
optimize support for the steps involved in processing the
ISSP modules. In this context, the Wizard manages
modules covering the associated default and national
setups, country-specific study descriptions and survey
records. (See Fig. 1: Managing study descriptions with the
ISSP DataWizard)

An import and export interface provides the capability of
reading and writing SPSS setup and data files. The open
XML standard of the Data Documentation Initiative is
also supported. This permits the use of a flexible and
vendor-independent data format and also enables a
straightforward exchange of data with other tools
supporting this standard.
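As an illustration of what such an XML export might look like, the following Python sketch writes a setup (variables with labels, codes and value labels) into a simplified DDI-style document. It is not the DataWizard's own export code, which is written in Java and not shown in this article. The element names (codeBook, dataDscr, var, labl, catgry, catValu) follow the DDI codebook vocabulary, but the output is deliberately minimal and is not validated against the DDI DTD; the function name setup_to_ddi and the sample variable are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def setup_to_ddi(variables):
    """Serialize a setup as a simplified, DDI-style XML string.

    variables: {variable name: (variable label, {code: value label})}
    Hypothetical helper for illustration; not schema-validated.
    """
    code_book = ET.Element("codeBook")
    data_dscr = ET.SubElement(code_book, "dataDscr")
    for name, (label, values) in variables.items():
        var = ET.SubElement(data_dscr, "var", name=name)
        ET.SubElement(var, "labl").text = label
        for code, value_label in values.items():
            # One category element per code, holding the code and its label.
            catgry = ET.SubElement(var, "catgry")
            ET.SubElement(catgry, "catValu").text = str(code)
            ET.SubElement(catgry, "labl").text = value_label
    return ET.tostring(code_book, encoding="unicode")


if __name__ == "__main__":
    # Invented sample variable, not taken from an actual ISSP setup.
    print(setup_to_ddi({"V4": ("Sample question", {1: "Agree", 2: "Disagree"})}))
```

Because the result is plain XML, any other tool that reads the same vocabulary can pick up the variable and value definitions without knowing anything about the program that produced them.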

The setup that defines variables and values with codes,
labels and comments - thus matching the questionnaire -
can be viewed, edited or even re-created using the
DataWizard. It is also possible to alter the sequence of
variables and values. This can be done both for the default
questionnaire of a module and for the country-specific
questionnaires that implement this standard. In addition to
making entries in the editor, users can also adopt variables
and values from other modules, e.g. in the case of
permanent demographic variables, where questions are
asked that pertain to a module with a subject area covered
in a preceding module, or in conjunction with values on
scales that are used for several variables. It is also
conceivable for partners to copy the entire default setup and
make the requisite country-specific changes to the copied
version so as to reduce the number of transfer errors, such
as scale reversals. (See Fig. 2: Managing setups with
the ISSP DataWizard)

Fig. 1: Managing study descriptions with the ISSP DataWizard

The central function of the ISSP DataWizard is to compare
country-specific setups with the default setup. To this end,
the work carried out "intellectually" at the ZA has been
analyzed and - in the simple cases - translated into rules
wherever possible. An automatic mapping process, for
instance, identifies reversed scales, incorrect variable codes
or similar problems and marks them as errors. Variables and
values capable of being assigned beyond doubt are marked «OK»,
and dubious assignments are highlighted for intellectual
investigation. In the process of intellectual post-processing,
which cannot be avoided, these problem cases are highlighted
(e.g. using colour markings and an error browser), whereas
unequivocal assignments require no further attention. The editor
for post-processing compares a country-specific setup with its
allocated default setup and synchronizes the display in order
to quickly provide a clear view of the information relevant to a
variable. Assignments can be made from context-related
selection lists, thus reducing the likelihood of making incorrect
entries and easing the demands on the user's attentiveness. In
this way, it is also possible to adopt values into the default
setup that have been «overlooked» for a particular country
without having to switch to the part for processing the default
setup. (See Fig. 3: Post-editing setup assignments using the ISSP
DataWizard)
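The kind of rule-based comparison described here can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the DataWizard's actual rule set: a setup is modelled as a dictionary of variables, value labels are compared as exact strings (real country setups would need fuzzier matching), and the class name Variable, the function names and the status strings are all invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Variable:
    name: str
    label: str
    values: Dict[int, str] = field(default_factory=dict)  # code -> value label


def compare_variable(default: Variable, country: Variable) -> str:
    """Classify one country variable against its default counterpart."""
    if country.values == default.values:
        return "OK"
    codes = sorted(default.values)
    # Reversed scale: same codes, but the labels run in the opposite order.
    if sorted(country.values) == codes and all(
        country.values[c] == default.values[r]
        for c, r in zip(codes, reversed(codes))
    ):
        return "ERROR: reversed scale"
    if set(country.values) != set(default.values):
        return "ERROR: codes differ"
    # Same codes, different labels: leave the decision to a human editor.
    return "CHECK: labels differ"


def compare_setups(default: Dict[str, Variable],
                   country: Dict[str, Variable]) -> Dict[str, str]:
    """Return a status per default variable: OK, ERROR or CHECK."""
    report = {}
    for name, default_var in default.items():
        country_var = country.get(name)
        if country_var is None:
            report[name] = "CHECK: variable missing"
        else:
            report[name] = compare_variable(default_var, country_var)
    return report
```

Everything reported as "OK" needs no further attention, while the "ERROR" and "CHECK" entries correspond to the cases that would be highlighted for intellectual post-processing.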


Fig. 2: Managing setups with the ISSP DataWizard


The database archives both the reference setups of ISSP
partners and the edited versions. This means it is possible
at any time to access the data submitted and check the
adaptations made. In addition, it is possible to document
all checks and assignments in a text file that additionally
contains a complete concordance between a country-specific
setup and the relevant default setup. This creates
transparency in terms of merging and mapping: any suspected
error can be reliably investigated.

Adaptations to the setups (to the structure of the data
collected) must also be applied to the data itself. Divergent
value codes, reversed scales, etc. must - as adapted in the
setup - also be corrected in the data. The necessary
re-coding of data takes place automatically on the basis
of the mapped setup. Here too, the database archives the data
originally submitted as well as the edited version.
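A minimal Python sketch of this step, assuming that the mapped setup has already been reduced to one code translation table per variable, could look as follows. The function name recode, the record layout (one dictionary per respondent) and the log format are assumptions made for the illustration; the DataWizard itself stores both versions in its database rather than in memory.

```python
import copy


def recode(records, mapping):
    """Apply code translations and return (edited copy, change log).

    records: list of {variable name: code} dictionaries (one per respondent)
    mapping: {variable name: {country code: standard code}}
    The submitted records are not modified, so both versions stay available.
    """
    edited = copy.deepcopy(records)
    log = []
    for row_number, row in enumerate(edited):
        for var, table in mapping.items():
            old = row.get(var)
            if old in table and table[old] != old:
                row[var] = table[old]
                # Keep a trace of every change for the processing report.
                log.append((row_number, var, old, table[old]))
    return edited, log


if __name__ == "__main__":
    # Invented data: V4 was delivered with a reversed three-point scale.
    submitted = [{"V4": 1, "SEX": 1}, {"V4": 3, "SEX": 2}]
    edited, log = recode(submitted, {"V4": {1: 3, 2: 2, 3: 1}})
    print(edited)
    print(log)
```

Returning a copy rather than editing in place mirrors the requirement that the data originally submitted remain accessible next to the edited version, and the log provides the raw material for the report of processing steps.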

It is also possible to employ customized rules in order, for
example, to examine the consistency of filter queries or to
re-code country-specific indices to a default index, provided
this is possible without intellectual scrutiny. Users are
provided with complex tools for creating and applying these
rules. (See Fig. 4: Simple data visualization with the ISSP
DataWizard)

Finally, users can view data in straightforward counts that
allow them to carry out simple checks, e.g. whether the age
distribution is as expected. Here, it is also possible to
compare the reference data version with the edited version and
thus identify the result of the mapping and re-coding process.
However, these capabilities only serve to monitor the success
of merging; the actual data analysis is still to be carried out
using the normal statistics program packages.
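Such a side-by-side count can be illustrated with a short Python sketch. It assumes records held as one dictionary per respondent and codes that are integers or missing (None); the function names are invented for the example and do not describe the DataWizard's internals.

```python
from collections import Counter


def frequencies(records, var):
    """Count how often each code of one variable occurs."""
    return Counter(row.get(var) for row in records)


def compare_frequencies(reference, edited, var):
    """Return (code, count in reference, count in edited) for every code."""
    ref = frequencies(reference, var)
    ed = frequencies(edited, var)
    # Sort integer codes ascending and put missing values (None) last.
    codes = sorted(set(ref) | set(ed), key=lambda c: (c is None, c))
    return [(code, ref.get(code, 0), ed.get(code, 0)) for code in codes]


if __name__ == "__main__":
    # Invented data: a reversed scale before and after re-coding.
    reference = [{"V4": 1}, {"V4": 1}, {"V4": 3}]
    edited = [{"V4": 3}, {"V4": 3}, {"V4": 1}]
    for code, before, after in compare_frequencies(reference, edited, "V4"):
        print(code, before, after)
```

For a correctly reversed scale the two count columns mirror each other, which is the kind of quick check on the mapping and re-coding result that the paragraph describes.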


Whereas the current version of the ISSP DataWizard was
primarily developed to support the ZA in merging international
data, future versions will have to be developed towards the
distribution of work between the ZA and the ISSP project
partners. It will be necessary to resolve questions of
data exchange, access rights etc.

Conclusion

The ISSP DataWizard represents a tool that reduces both the
effort involved in merging ISSP records and the possibility
of errors by automating simple activities and offering support
to users in the necessary process of intellectual post-editing.
Once the tool has proven its worth in practice, it must be
further improved and optimized towards the requirements of the
applications for which it is used. A general extension of the
methods used beyond the ISSP context to similar
application areas cannot be ruled out.

Fig. 3: Post-editing setup assignments using the ISSP DataWizard


A conceivable area for future development is the addition
of further elements of artificial intelligence to the ISSP
DataWizard that do not focus solely on the data structure of
the default setup. For instance, plausibility rules could be
introduced that contain the anticipated spread of
characteristics and provide warnings where country-specific
data stray beyond the given confidence intervals.
In this way, the user would be drawn to critical points and
could consider taking a closer look at the data and
documents. Developments of this type will need to be the
subject of further concept formulation.
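One very simple reading of such a plausibility rule is an expected range for the share of a category, with a warning whenever the observed share falls outside it. The Python sketch below shows only this reading; the rule format, the function name and the sample range are assumptions, and a real implementation would have to derive the intervals from earlier modules or official statistics.

```python
def plausibility_warnings(records, rules):
    """Check observed category shares against expected ranges.

    records: list of {variable name: code} dictionaries
    rules:   {(variable name, code): (lowest share, highest share)}
    Returns one warning string per violated rule.
    """
    warnings = []
    for (var, code), (low, high) in rules.items():
        # Only respondents with a valid (non-missing) answer are counted.
        valid = [row[var] for row in records if row.get(var) is not None]
        if not valid:
            continue
        share = valid.count(code) / len(valid)
        if not low <= share <= high:
            warnings.append(
                f"{var}={code}: share {share:.2f} outside [{low:.2f}, {high:.2f}]"
            )
    return warnings


if __name__ == "__main__":
    # Invented sample: 8 of 10 respondents have SEX=1, expected 40-60 percent.
    sample = [{"SEX": 1}] * 8 + [{"SEX": 2}] * 2
    print(plausibility_warnings(sample, {("SEX", 1): (0.4, 0.6)}))
```

A warning of this kind does not decide anything by itself; it only points the user to a variable where the data and documents deserve a closer look.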